A Probability Neural Network for Continuous and Categorical Data
Abstract
In most data classification applications, the data sets contain both continuous and categorical variables; such multivariate mixed data arise frequently in practice. This paper presents a novel Probability Neural Network (PNN) that can classify data with both continuous and categorical input variables. Data with only continuous or only categorical input variables are special cases of the mixed case, so the proposed PNN can also be applied to them. The Expectation Maximisation (EM) algorithm is widely used for mixture models of continuous variables, but it is not applicable to categorical variables. A mixture model of continuous and categorical variables is used to construct a Probability Density Function (PDF), which is the key part of the PNN. The proposed PNN has two advantages over conventional algorithms such as the Multilayer Perceptron (MLP) neural network. First, the PNN can produce better results than the MLP, even when the MLP is given normalized input variables, which normally yield better MLP results than non-normalized ones. Second, the PNN does not need a cross-validation data set and does not overtrain as the MLP does. These advantages are supported by our experimental study. The proposed PNN can also be used to perform unsupervised cluster analysis. The superiority of the PNN over the MLP neural network is demonstrated by applying both to a real-life data set, the Trauma data set, which includes both continuous and categorical variables. Copyright © 2005 IFAC
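The abstract does not give the form of the density estimate, so the following is only a rough sketch of how a PNN-style classifier can score inputs that mix continuous and categorical variables. It assumes a classic kernel-based PNN rather than the mixture model the paper proposes: a Gaussian kernel on the continuous part multiplied by an Aitchison-Aitken kernel on the categorical part. The names `mixed_kernel`, `pnn_classify`, `sigma`, `lam`, and `toy_train`, and the toy data itself, are hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict


def mixed_kernel(xc, xd, tc, td, sigma, lam, levels):
    """Product kernel for one query/training pair (illustrative assumption).

    Continuous parts (xc, tc) use an unnormalized Gaussian kernel.
    Categorical parts (xd, td) use an Aitchison-Aitken kernel:
    weight 1 - lam on a match, lam / (c - 1) on a mismatch,
    where c is the number of levels of that variable.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(xc, tc))
    k = math.exp(-sq_dist / (2.0 * sigma ** 2))
    for a, b, c in zip(xd, td, levels):
        k *= (1.0 - lam) if a == b else lam / (c - 1)
    return k


def pnn_classify(train, query_cont, query_cat, sigma=1.0, lam=0.2):
    """Return (predicted_label, scores) for one query.

    train is a list of (continuous_tuple, categorical_tuple, label).
    """
    # Infer the number of levels of each categorical variable from the data.
    n_cat = len(train[0][1])
    levels = [max(2, len({row[1][j] for row in train})) for j in range(n_cat)]

    # Pattern layer + summation layer: add up kernel responses per class.
    sums = defaultdict(float)
    for tc, td, label in train:
        sums[label] += mixed_kernel(query_cont, query_cat, tc, td,
                                    sigma, lam, levels)

    # Dividing by the total sample count weights each class-conditional
    # average by the empirical class prior.
    n = len(train)
    scores = {label: s / n for label, s in sums.items()}

    # Decision layer: pick the class with the largest score.
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two continuous inputs, one categorical input.
toy_train = [
    ((0.0, 0.1), ("a",), "low"),
    ((0.2, 0.0), ("a",), "low"),
    ((3.0, 3.1), ("b",), "high"),
    ((2.9, 3.2), ("b",), "high"),
]

label, scores = pnn_classify(toy_train, (0.1, 0.1), ("a",))
print(label, scores)
```

There is no iterative weight training in this sketch: the training patterns are stored and only the smoothing parameters `sigma` and `lam` need to be chosen. In practice the continuous inputs would usually be rescaled to comparable ranges before a single `sigma` is applied.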
Similar resources
Town trip forecasting based on data mining techniques
In this paper, a data mining approach is proposed for duration prediction of the town trips (travel time) in New York City. In this regard, at first, two novel approaches, including a mathematical and a statistical approach, are proposed for grouping categorical variables with a huge number of levels. The proposed approaches work based on the cost matrix generated by repetitive post-hoc tests f...
Identifying Flow Units Using an Artificial Neural Network Approach Optimized by the Imperialist Competitive Algorithm
The spatial distribution of petrophysical properties within a reservoir is one of the most important factors in reservoir characterization. A flow unit is a continuous body over a specific reservoir volume within which the geological and petrophysical properties are the same. Accordingly, accurate prediction of flow units is a major task in achieving a reliable petrophysical description o...
Comparison of the decision tree, artificial neural network, and linear regression methods based on the number and types of independent variables and sample size
In this article, the performance of data mining and statistical techniques was empirically compared while varying the number of independent variables, the types of independent variables, the number of classes of the independent variables, and the sample size. Our study employed 60 simulated examples, with artificial neural networks and decision trees as the data mining techniques, and linear re...
The optimized model of factors effecting on the Merger and Acquisition from multiple dimensions with neural network approach.
Nowadays, firms apply the merger and acquisition strategy to gain synergy, increase stockholder wealth, achieve economies of scale, enhance efficiency, increase research and development capability, develop the firm, and decrease risk. Developing an optimized model with the ability to identify the variables that affect the merger and acquisition process has a significant ...
Adsorption of Fe (II) from Aqueous Phase by Chitosan: Application of Physical Models and Artificial Neural Network for Prediction of Breakthrough
Removal of Fe (II) from aqueous media was investigated using chitosan as the adsorbent in both batch and continuous systems. Batch experiments were carried out at initial concentration range of 10-50 mg/L and temperature range of 20–40˚C. In batch experiments, maximum adsorption capacity of 28.7 mg/g and removal efficiency of 93% were obtained. Adsorption equilibrium data were well-fitted with ...